
    The distribution of amorphous computer outputs

    Fitness distributions (landscapes) of programs tend to a limit as they get bigger. Markov minorization gives upper bounds ((15.3 + 2.30m)/log I) on the length of program run on random or average computing devices, where I is the size of the instruction set and m the size of the output register. Almost all programs are constants. Convergence is exponential, with 90% of programs of length 1.6 n 2^N yielding constants (n = size of the input register, N = size of memory). This is supported by experiment.
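
    As a rough illustration (a minimal sketch, not code from the paper; the register and memory sizes below and the use of a natural log are assumptions), the two bounds can be evaluated directly in Python:

        import math

        # Upper bound (15.3 + 2.30*m)/log(I) on the expected program run length;
        # m = output register size (bits), I = instruction set size.
        def run_length_upper_bound(m, I):
            return (15.3 + 2.30 * m) / math.log(I)

        # Program length at which ~90% of random programs yield constants:
        # 1.6 * n * 2**N, with n = input register size, N = memory size.
        def convergence_length(n, N):
            return 1.6 * n * 2 ** N

        # Illustrative (hypothetical) register and memory sizes:
        print(run_length_upper_bound(m=8, I=16))  # ~12.2 instructions
        print(convergence_length(n=8, N=8))       # 3276.8 instructions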

    Repeated patterns in tree genetic programming

    We extend our analysis of repetitive patterns found in genetic programming genomes to tree-based GP. As in linear GP, repetitive patterns are present in large numbers. Size fair crossover limits bloat in automatic programming, preventing the evolution of recurring motifs. We examine these complex properties in detail, e.g. using depth vs. size Catalan binary tree shape plots, subgraph and subtree matching, information entropy, syntactic and semantic fitness correlations, and diffuse introns. We relate this emergent phenomenon to considerations about building blocks in GP and how GP works.
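
    A minimal sketch of subtree matching (an illustration under an assumed tuple encoding of trees, not the paper's implementation): canonicalise each subtree to a string and count how often it recurs; subtrees seen more than once are candidate repeated motifs.

        from collections import Counter

        def subtree_strings(tree, counter):
            if isinstance(tree, tuple):  # internal node: (op, child, ...)
                s = "(" + tree[0] + " " + " ".join(
                    subtree_strings(c, counter) for c in tree[1:]) + ")"
            else:                        # leaf: terminal symbol
                s = str(tree)
            counter[s] += 1
            return s

        prog = ("+", ("*", "x", "x"), ("+", ("*", "x", "x"), "1"))
        counts = Counter()
        subtree_strings(prog, counts)
        print([(s, c) for s, c in counts.items() if c > 1])
        # [('x', 4), ('(* x x)', 2)]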

    Genetic programming in data mining for drug discovery

    Genetic programming (GP) is used to extract from rat oral bioavailability (OB) measurements simple, interpretable and predictive QSAR models which generalise both to rats and to marketed drugs in humans. Receiver Operating Characteristics (ROC) curves for the binary classifier produced by machine learning show no statistical difference between rats (albeit without known clearance differences) and man. Thus evolutionary computing offers the prospect of in silico ADME screening, e.g. for "virtual" chemicals, for pharmaceutical drug discovery.
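
    The ROC comparison can be sketched as follows (a minimal illustration with made-up scores, not the paper's data or code): the area under the ROC curve equals the Mann-Whitney probability that a positive example outscores a negative one.

        # AUC as the Mann-Whitney statistic: the probability that a randomly
        # chosen positive (high-OB) compound outscores a random negative one.
        def auc(scores_pos, scores_neg):
            wins = sum((p > n) + 0.5 * (p == n)
                       for p in scores_pos for n in scores_neg)
            return wins / (len(scores_pos) * len(scores_neg))

        # Hypothetical classifier scores, not data from the paper:
        print(auc([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1]))  # 0.9375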

    Repeated sequences in linear genetic programming genomes

    Biological chromosomes are replete with repetitive sequences: microsatellites, SSR tracts, Alu elements, etc. in their DNA base sequences. We started looking for similar phenomena in evolutionary computation. First studies find copious repeated sequences, which can be hierarchically decomposed into shorter sequences, in programs evolved using both homologous and two-point crossover but not with headless chicken crossover or other mutations. In bloated programs the small number of effective or expressed instructions appear in both repeated and non-repeated code, hinting that building blocks or code reuse may evolve in unplanned ways. Mackey-Glass chaotic time series prediction and eukaryotic protein localisation (both previously used as artificial intelligence machine learning benchmarks) demonstrate evolution of Shannon information (entropy) and lead to models capable of lossy Kolmogorov compression. Our findings with diverse benchmarks and GP systems suggest this emergent phenomenon may be widespread in genetic systems.
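
    A minimal sketch of how repeated sequences can be detected in a linear genome (an n-gram count under an assumed instruction encoding, not the paper's method):

        from collections import Counter

        # Slide a window of length k over the genome and count each
        # instruction subsequence; counts > 1 indicate repeated sequences.
        def repeated_ngrams(genome, k):
            grams = Counter(tuple(genome[i:i + k])
                            for i in range(len(genome) - k + 1))
            return {g: c for g, c in grams.items() if c > 1}

        # Hypothetical register-machine genome, not from the paper:
        genome = ["ADD r1 r2", "MUL r1 r1", "ADD r1 r2", "MUL r1 r1",
                  "SUB r3 r1"]
        print(repeated_ngrams(genome, 2))
        # {('ADD r1 r2', 'MUL r1 r1'): 2}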

    Genetic Improvement of computational biology software

    There is a cultural divide between computer scientists and biologists that needs to be addressed. The two disciplines used to be quite unrelated, but many new research areas have arisen from their synergy. We selectively review two multi-disciplinary problems: dealing with contamination in sequencing data repositories and improving software using biology-inspired evolutionary computing. Through several examples, we show that ideas from biology may result in optimised code and provide surprising improvements that overcome challenges in speed and quality trade-offs. On the other hand, development of computational methods is essential for maintaining contamination-free databases. Computer scientists and biologists must always be sceptical of each other's data, just as they would be of their own.

    Evolving text classification rules with genetic programming

    We describe a novel method for using genetic programming to create compact classification rules using combinations of N-grams (character strings). Genetic programs acquire fitness by producing rules that are effective classifiers in terms of precision and recall when evaluated against a set of training documents. We describe a set of functions and terminals and provide results from a classification task using the Reuters-21578 dataset. We also suggest that the rules may have a number of other uses beyond classification and provide a basis for text mining applications.
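
    A minimal sketch of the fitness evaluation (the rule, the documents, and the use of F1 to combine precision and recall are assumptions for illustration; the paper's function set may differ):

        # A hypothetical evolved rule: a boolean combination of N-gram tests.
        def rule(doc):
            return ("oil" in doc and "barrel" in doc) or "opec" in doc

        # Fitness from precision and recall over the training documents.
        def f1_fitness(rule, docs, labels):
            tp = sum(1 for d, y in zip(docs, labels) if rule(d) and y)
            fp = sum(1 for d, y in zip(docs, labels) if rule(d) and not y)
            fn = sum(1 for d, y in zip(docs, labels) if not rule(d) and y)
            p = tp / (tp + fp) if tp + fp else 0.0
            r = tp / (tp + fn) if tp + fn else 0.0
            return 2 * p * r / (p + r) if p + r else 0.0

        docs = ["opec raises output", "oil barrel price up", "wheat crop report"]
        labels = [True, True, False]  # True = on-topic (hypothetical)
        print(f1_fitness(rule, docs, labels))  # 1.0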

    Memory with memory in genetic programming

    We introduce Memory with Memory Genetic Programming (MwM-GP), where we use soft assignments and soft return operations. Instead of having the new value completely overwrite the old value of registers or memory, soft assignments combine such values. Similarly, in soft return operations the value of a function node is a blend between the result of a calculation and previously returned results. In extensive empirical tests, MwM-GP almost always does as well as traditional GP, while significantly outperforming it in several cases. MwM-GP also tends to be far more consistent than traditional GP. The data suggest that MwM-GP works by successively refining an approximate solution to the target problem and that it is much less likely to have truly ineffective code. MwM-GP can continue to improve over time, but it is less likely to get the sort of exact solution that one might find with traditional GP.
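
    A minimal sketch of the soft-assignment idea (the blending rule and the gamma value are assumptions, not the paper's exact definition):

        # Soft assignment: blend the computed value into the register instead
        # of overwriting it (gamma is an assumed blending parameter; the
        # paper's exact combination rule may differ).
        def soft_assign(old, new, gamma=0.1):
            return (1 - gamma) * old + gamma * new

        reg = 0.0
        for computed in [5.0, 5.0, 5.0]:  # repeated writes drift toward 5.0
            reg = soft_assign(reg, computed)
        print(reg)  # 1.355: successive refinement rather than a hard overwrite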